Large‐vocabulary spontaneous speech recognition using a corpus of lectures
Internal identifier: 000245 (Main/Exploration); previous: 000244; next: 000246
Authors: Masafumi Nishimura [Japan]; Nobuyasu Itoh [Japan]
Source:
- Electronics and Communications in Japan (Part III: Fundamental Electronic Science) [1042-0967]; 2003-08.
English descriptors
- KwdEn: dictation; disfluency; large‐vocabulary speech recognition; speech corpus; spontaneous speech
Abstract
Now that dictation systems have become a reality, research on large‐vocabulary voice recognition is shifting from “read speech” to “spontaneous speech.” The features of this kind of spontaneous speech have been investigated from various perspectives using a corpus of dialog speech. However, in particular for the Japanese language, creating a voice recognition system which has massive amounts of data and is based on statistical methods is not necessarily sufficient. There have been few reports on the performance of a large‐vocabulary voice recognition system whose purpose is tagging spontaneous speech. The authors have prepared a spontaneous speech corpus using as materials lecture speeches from the University of the Air for the purpose of improving the recognition precision for spontaneous speech. They report here on the recognition performance of their large‐vocabulary spontaneous speech recognition system created using this corpus. The results of experiments showed that the word error rate for lecture speeches, which was 51.5% under conventional systems that use reading, was reduced to 16.4%. © 2003 Wiley Periodicals, Inc. Electron Comm Jpn Pt 3, 86(8): 52–60, 2003; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ecjc.10105
Url:
DOI: 10.1002/ecjc.10105
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000175
- to stream Istex, to step Curation: 000175
- to stream Istex, to step Checkpoint: 000202
- to stream Main, to step Merge: 000266
- to stream Main, to step Curation: 000245
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Large‐vocabulary spontaneous speech recognition using a corpus of lectures</title>
<author><name sortKey="Nishimura, Masafumi" sort="Nishimura, Masafumi" uniqKey="Nishimura M" first="Masafumi" last="Nishimura">Masafumi Nishimura</name>
</author>
<author><name sortKey="Itoh, Nobuyasu" sort="Itoh, Nobuyasu" uniqKey="Itoh N" first="Nobuyasu" last="Itoh">Nobuyasu Itoh</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:DBCF4F66E25C7F71FD900AD3DDC1E6095F8ECB93</idno>
<date when="2003" year="2003">2003</date>
<idno type="doi">10.1002/ecjc.10105</idno>
<idno type="url">https://api.istex.fr/document/DBCF4F66E25C7F71FD900AD3DDC1E6095F8ECB93/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000175</idno>
<idno type="wicri:Area/Istex/Curation">000175</idno>
<idno type="wicri:Area/Istex/Checkpoint">000202</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000202</idno>
<idno type="wicri:doubleKey">1042-0967:2003:Nishimura M:large:vocabulary:spontaneous</idno>
<idno type="wicri:Area/Main/Merge">000266</idno>
<idno type="wicri:Area/Main/Curation">000245</idno>
<idno type="wicri:Area/Main/Exploration">000245</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Large‐vocabulary spontaneous speech recognition using a corpus of lectures</title>
<author><name sortKey="Nishimura, Masafumi" sort="Nishimura, Masafumi" uniqKey="Nishimura M" first="Masafumi" last="Nishimura">Masafumi Nishimura</name>
<affiliation wicri:level="1"><country xml:lang="fr">Japon</country>
<wicri:regionArea>Tokyo Research Laboratory, IBM Japan, Ltd., Yamato</wicri:regionArea>
<wicri:noRegion>Yamato</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Itoh, Nobuyasu" sort="Itoh, Nobuyasu" uniqKey="Itoh N" first="Nobuyasu" last="Itoh">Nobuyasu Itoh</name>
<affiliation wicri:level="1"><country xml:lang="fr">Japon</country>
<wicri:regionArea>Tokyo Research Laboratory, IBM Japan, Ltd., Yamato</wicri:regionArea>
<wicri:noRegion>Yamato</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">Electronics and Communications in Japan (Part III: Fundamental Electronic Science)</title>
<title level="j" type="abbrev">Electron. Comm. Jpn. Pt. III</title>
<idno type="ISSN">1042-0967</idno>
<idno type="eISSN">1520-6440</idno>
<imprint><publisher>Wiley Subscription Services, Inc., A Wiley Company</publisher>
<pubPlace>New York</pubPlace>
<date type="published" when="2003-08">2003-08</date>
<biblScope unit="volume">86</biblScope>
<biblScope unit="issue">8</biblScope>
<biblScope unit="page" from="52">52</biblScope>
<biblScope unit="page" to="60">60</biblScope>
</imprint>
<idno type="ISSN">1042-0967</idno>
</series>
<idno type="istex">DBCF4F66E25C7F71FD900AD3DDC1E6095F8ECB93</idno>
<idno type="DOI">10.1002/ecjc.10105</idno>
<idno type="ArticleID">ECJC10105</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">1042-0967</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>dictation</term>
<term>disfluency</term>
<term>large‐vocabulary speech recognition</term>
<term>speech corpus</term>
<term>spontaneous speech</term>
</keywords>
</textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Now that dictation systems have become a reality, research on large‐vocabulary voice recognition is shifting from “read speech” to “spontaneous speech.” The features of this kind of spontaneous speech have been investigated from various perspectives using a corpus of dialog speech. However, in particular for the Japanese language, creating a voice recognition system which has massive amounts of data and is based on statistical methods is not necessarily sufficient. There have been few reports on the performance of a large‐vocabulary voice recognition system whose purpose is tagging spontaneous speech. The authors have prepared a spontaneous speech corpus using as materials lecture speeches from the University of the Air for the purpose of improving the recognition precision for spontaneous speech. They report here on the recognition performance of their large‐vocabulary spontaneous speech recognition system created using this corpus. The results of experiments showed that the word error rate for lecture speeches, which was 51.5% under conventional systems that use reading, was reduced to 16.4%. © 2003 Wiley Periodicals, Inc. Electron Comm Jpn Pt 3, 86(8): 52–60, 2003; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/ecjc.10105</div>
</front>
</TEI>
<affiliations><list><country><li>Japon</li>
</country>
</list>
<tree><country name="Japon"><noRegion><name sortKey="Nishimura, Masafumi" sort="Nishimura, Masafumi" uniqKey="Nishimura M" first="Masafumi" last="Nishimura">Masafumi Nishimura</name>
</noRegion>
<name sortKey="Itoh, Nobuyasu" sort="Itoh, Nobuyasu" uniqKey="Itoh N" first="Nobuyasu" last="Itoh">Nobuyasu Itoh</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Ticri/explor/TeiVM2/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000245 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000245 | SxmlIndent | more
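Outside a Dilib installation, a record of this shape can also be read with a general-purpose XML parser. The following is a minimal Python sketch, not part of Dilib: the function name `parse_record`, the placeholder namespace URI `urn:example:wicri`, and the shortened `SAMPLE` record are all assumptions made for illustration. The record uses the `wicri:` prefix without declaring it, so the sketch wraps the text in an element that binds the prefix before parsing. It also assumes the record text carries no XML declaration of its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder URI: the record uses "wicri:" without declaring it,
# so the prefix is bound here only to make the text parseable.
WICRI_NS = "urn:example:wicri"

def parse_record(record_xml: str) -> dict:
    """Extract title, authors, DOI and keywords from a <record> string."""
    # Wrap the record so the undeclared "wicri:" prefix is in scope.
    wrapped = f'<wrapper xmlns:wicri="{WICRI_NS}">{record_xml}</wrapper>'
    root = ET.fromstring(wrapped)
    header = root.find("./record/TEI/teiHeader")
    return {
        "title": header.findtext("./fileDesc/titleStmt/title"),
        "authors": [n.text for n in
                    header.findall("./fileDesc/titleStmt/author/name")],
        "doi": header.findtext(
            "./fileDesc/publicationStmt/idno[@type='doi']"),
        "keywords": [t.text for t in
                     header.findall("./profileDesc/textClass/keywords/term")],
    }

# Shortened copy of the record, kept only to make the sketch self-contained.
SAMPLE = """<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Large‐vocabulary spontaneous speech recognition using a corpus of lectures</title>
<author><name>Masafumi Nishimura</name>
</author>
<author><name>Nobuyasu Itoh</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="doi">10.1002/ecjc.10105</idno>
</publicationStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>dictation</term>
<term>disfluency</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
</TEI>
</record>"""

if __name__ == "__main__":
    for field, value in parse_record(SAMPLE).items():
        print(f"{field}: {value}")
```

The same function should work on the full record text, for example the output of the `HfdSelect` commands saved to a file, provided the `wicri:` prefix is the only undeclared one.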
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Ticri |area= TeiVM2 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:DBCF4F66E25C7F71FD900AD3DDC1E6095F8ECB93 |texte= Large‐vocabulary spontaneous speech recognition using a corpus of lectures }}
This area was generated with Dilib version V0.6.31.